A First Portfolio





Kerry Back

  • Buy stocks of profitable companies that aren’t too expensive. Avoid large caps and penny stocks.
    • Profitable = high income to book equity (ROE)
    • Cost = price to book equity per share (PB)
    • Large caps = largest 500 stocks by market cap
    • Penny stocks = less than $5
  • This strategy has not been back-tested!

Data

Connect to the SQL server in python

import pandas as pd
import pymssql
from sqlalchemy import create_engine

server = 'fs.rice.edu'
database = 'stocks'
username = 'stocks'
password = '6LAZH1'

string = "mssql+pymssql://" + username + ":" + password + "@" + server + "/" + database 
conn = create_engine(string).connect()

SQL queries

SELECT column_name(s)
FROM table_name
JOIN table_name ON column_name
WHERE condition
GROUP BY column_name(s)
HAVING condition
ORDER BY column_name(s);

Grab some data from quarterly reports

quarterly = pd.read_sql(
    """
    select datekey, reportperiod, ticker, netinc, equity
    from sf1
    where dimension='ARQ' and equity>0
    order by ticker, datekey
    """,
    conn
)
quarterly = quarterly.dropna()

Define ROE as net income / beginning of quarter equity

quarterly["equitylag"] = quarterly.groupby("ticker").equity.shift()
quarterly["roeq"] = quarterly.netinc / quarterly.equitylag

# save last report for each firm
quarterly = quarterly.groupby("ticker").last()
quarterly = quarterly[quarterly.reportperiod.astype(str)>="2022-06-01"]

# drop variables other than roeq and ticker (ticker=index)
quarterly = quarterly[["roeq"]]

Grab market cap, and PB from a different table

daily1 = pd.read_sql(
    """ 
    select date, ticker, pb, marketcap
    from daily 
    where date>='2023-01-01'
    order by ticker, date
    """,
    conn
)
daily1 = daily1.dropna()
daily1 = daily1.groupby("ticker").last()

Grab price from yet another table

daily2 = pd.read_sql(
    """ 
    select date, ticker, close_ as price
    from sep 
    where date>='2023-01-01'
    order by ticker, date
    """,
    conn
)
daily2 = daily2.dropna()
daily2 = daily2.groupby("ticker").last()

Merge data sets

df = pd.concat((quarterly, daily1, daily2), axis=1)
df.info()
<class 'pandas.core.frame.DataFrame'>
Index: 3606 entries, A to TPG
Data columns (total 6 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   roe        3352 non-null   float64
 1   date       3540 non-null   object 
 2   pb         3540 non-null   float64
 3   marketcap  3540 non-null   float64
 4   date       3541 non-null   object 
 5   price      3541 non-null   float64
dtypes: float64(4), object(2)
memory usage: 197.2+ KB

Drop penny stocks

df = df[df.price>5]
df.shape
(2579, 6)

Drop large caps

df["size_rank"] = df.marketcap.rank(ascending=False)
df = df[df.size_rank > 500]

Find profitable stocks that aren’t too expensive

filter1 = df.roe >= df.roe.quantile(0.7)
filter2 = df.pb <= df.pb.quantile(0.3)
df = df[filter1 & filter2].copy()
ticks = df.index.to_list()
len(ticks)
83

Create alpaca account

  • After logging in to Alpaca, you DO NOT need to enter name, address, etc. - that is only for real trading accounts.
  • By clicking the Overview icon on the left toolbar, you should get to the screen on the following slide.
  • Click on API Keys, then Regenerate to see your key and your secret key. Copy and save them somewhere.

Import alpaca

  • Install alpaca-py. Then
from alpaca.data import StockHistoricalDataClient
from alpaca.trading.client import TradingClient
from alpaca.data.requests import StockLatestQuoteRequest
from alpaca.trading.requests import MarketOrderRequest
from alpaca.trading.enums import OrderSide, TimeInForce

Connect to alpaca

KEY = "your_key"
SECRET_KEY = "your_secret_key"

data_client = StockHistoricalDataClient(KEY, SECRET_KEY)
trading_client = TradingClient(KEY, SECRET_KEY, paper=True)

Get current quotes

params = StockLatestQuoteRequest(symbol_or_symbols=ticks)
quotes = data_client.get_stock_latest_quote(params)

# convert to dataframe

sers = [pd.Series(dict(x)) for x in quotes.values()]
df = pd.concat(sers, axis=1)
df.columns = [x for x in quotes]
df = df.T
df = df.drop(columns=["symbol"])
df.info()
<class 'pandas.core.frame.DataFrame'>
Index: 83 entries, SCVL to ARW
Data columns (total 9 columns):
 #   Column        Non-Null Count  Dtype              
---  ------        --------------  -----              
 0   timestamp     83 non-null     datetime64[ns, UTC]
 1   ask_exchange  83 non-null     object             
 2   ask_price     83 non-null     object             
 3   ask_size      83 non-null     object             
 4   bid_exchange  83 non-null     object             
 5   bid_price     83 non-null     object             
 6   bid_size      83 non-null     object             
 7   conditions    83 non-null     object             
 8   tape          83 non-null     object             
dtypes: datetime64[ns, UTC](1), object(8)
memory usage: 6.5+ KB

Compute trades

df["trade"] = df.ask_price.apply(lambda p: int(1000/p) if p !=0 else 0)
ticks = df[df.trade>0].index.to_list()

Send trades

for tick in ticks: 
    market_order_data = MarketOrderRequest(
        symbol=tick,
        qty=df.trade.loc[tick],
        side=OrderSide.BUY,
        time_in_force=TimeInForce.DAY
    )
    market_order = trading_client.submit_order(
        order_data=market_order_data
    )